Abstract

Molecular communication is a biologically-inspired method of communication with attractive properties for
microscale and nanoscale devices. In molecular communication, messages are transmitted by releasing a pattern
of molecules at a transmitter, which propagate through a fluid medium towards a receiver. In this paper, molecular
communication is formulated as a mathematical communication problem in an information-theoretic context.
Physically realistic models are obtained, with sufficient abstraction to allow manipulation by communication and
information theorists. Although mutual information in these channels is intractable, we give sequences of upper
and lower bounds on the mutual information which trade off complexity and performance, and present results to
illustrate the feasibility of these bounds in estimating the true mutual information.

I. Introduction

At the scale of microorganisms, which have dimensions on the order of $10^{-6}$ meters, biological communication
is often carried out using molecules; for example, it is well known that chemical signals are used
in communication and control among cells in living tissue. As another example, some species of bacteria
send chemical messages to their neighbors in a strategy known as "quorum sensing", which allows them to 
estimate the local population of their own species [1]. Given these examples, molecular communication [2] 
has been proposed as a biologically inspired solution to the problem of communicating among microscale 
or nanoscale devices. Molecular communication is an engineered form of communication, using designed 
systems to send messages from a transmitter to a receiver: the transmitter sends a message by releasing a 
pattern of molecules into a shared medium, while the receiver detects the message by observing the
arriving pattern of molecules, similarly to the related biological systems. The objective of this paper is
to provide an analytical basis for molecular communication by providing physically realistic models and
methods for bounding the information rate.

Molecular communication has attracted considerable attention from researchers across several disci- 
plines, including electrical engineering, microbiology, chemistry, and biomedicine. However, unlike most 
conventional forms of communication, virtually all of the molecular communication results in the literature
are experimental rather than analytical. There are many examples of systems that have been implemented 
in laboratory settings: for example, recent successes in the new field of systems biology [3], in which 
microorganisms are engineered and designed to perform specific tasks, have been exploited to produce 
molecular communication systems. Fundamental work in this direction was done by Weiss [4], [5], who 
adapted chemical pathways in microorganisms to construct simple "circuits" that communicate with 
each other, such as logic gates. Other researchers have focused on biological components that can be 
exploited in molecular communication. For example, to communicate between two engineered cells that 
are in direct contact, gap junctions (i.e., portals through two adjacent cell membranes) may be used to 
pass message-bearing molecules; a communication system using this principle was described in [6], [7]. 
Another technique packages message-bearing molecules in a small container known as a vesicle, and 
conveys them along a filament connecting two devices using a molecular motor [8], [9]. Since these two 
techniques require a connection between communicating devices, either direct contact or via a filament, 
they are analogous to wired communication. On the other hand, molecules may propagate in free space 
between the transmitter and receiver via Brownian motion, such as in the system proposed by [10]; this 
requires no connection and is analogous to wireless communication. Extensive experimental work has 
been conducted into molecular communication and related methods, and the references listed above are 
only a representative sample of that work. 

Meanwhile, very little work has been produced to provide an information-theoretic or communication- 
theoretic analysis of these channels. Berger [11] has presented a related idea known as "living information 
theory", where the goal was to analyze biological systems using information-theoretic tools. Work has 
also been done to find the capacity of the so-called chemical channel, also known as the trapdoor channel 
[12], [13]. This model captures many interesting features of the molecular communication problem, but
it is inadequate: for instance, as we show in Section IV-B, for any finite number of balls in the bin, there
exist sequences of molecule arrivals that occur with Pr > 0 in practice, but that have Pr = 0 under the
trapdoor channel. In terms of related work, similarities exist with timing channels, such as the queue
channel [14], [15].

Our main results are concerned with mathematical modeling of molecular communication, and information- 
theoretic performance bounds; these results are summarized as follows: 

1) Modeling. We present physically realistic models for the propagation environment, as well as
an ideal model of the transmitter and receiver in any molecular communication system (Section
II). These models provide a level of abstraction such that they may be used by information and
communication theorists, who may have no particular background in chemistry or biology, to
analyze and design molecular communication systems. We show that our ideal model is
information-theoretically meaningful, in that it provides an upper bound in terms of mutual information for any
alternative model (Theorem 1). Further, we show that any system with distinguishable molecules
can be separated into statistically independent systems (Theorem 2), so that from an
information-theoretic perspective, it is sufficient to solve the case where molecules are indistinguishable.
2) Performance bounds. We give methods to find bounds on achievable information rate in molecular
communication systems. When the input distributions are unconstrained, we show that the
achievable rates are infinite (Theorem 3). For constrained input distributions, the mutual information is
intractable, so we provide a sequence of both tractable and achievable lower bounds (Theorem 4,
restated from [16] and elsewhere), based on a straightforward and extensible approximation of the
channel (Section IV-D). We also provide a sequence of tractable upper bounds (Theorem 5). Both
sequences of bounds provide a natural tradeoff between performance and complexity. Results are
obtained to illustrate both the lower and upper bounds, which are given in Section VI.
Throughout this paper, we focus on molecular communication systems using free-space diffusion, but 
our models can be easily generalized to a wide variety of alternative scenarios. With these results, our 
hope is to generate interest in molecular communication from information theorists, and to inspire further 
research in this emerging field. 

The remainder of the paper is organized as follows. In Section II, we outline our channel model and
provide some useful modeling results. In Section III, we give some simple results showing that information
rates are infinite unless the input distribution is appropriately constrained. In Section IV, we give some
nontrivial and achievable lower bounds on information rate, while in Section V, we give upper bounds.
Finally, in Section VI, we give results illustrating the usefulness of these bounds.

II. Model 

In this section, we present a formal and physically realistic mathematical model for molecular com- 
munication based on free-space diffusion and Brownian motion. This model has useful properties for 
information theoretic analysis: first, it provides a layer of abstraction such that researchers with no 
background in chemistry can analyze molecular communication; and second, as we show in Theorem
1, the model is information-theoretically ideal, in the sense that relaxing our modeling assumptions leads
to a system with smaller mutual information.

A. Brownian motion as a communication medium 

Brownian motion refers to the random motion of particles, which may be individual molecules, as a
result of random collisions and interactions with molecules in the environment. There exists an extensive
body of literature concerning Brownian motion as a stochastic process, which has applications in physics
and beyond; the reader is directed to [17] for an introduction.

Fig. 1. The molecular communication system, with a point source transmitter, separated from the receiver by a distance d.

To see how Brownian motion may be used as a communication medium, suppose a point-source
transmitter, at the origin, and a receiver, located distance $d$ away, are connected by a fluid medium,
as depicted in Figure 1. For convenience, we assume a one-dimensional medium, but this is not essential
to the remainder of the paper; further, as shown in the figure, the one-dimensional model is practical for a receiver
that can be viewed as a plane (e.g., for a cell that is in close proximity to the transmitter). The transmitter
has a message $w \in \mathcal{W}$, where $\mathcal{W}$ is the set of all possible messages. The transmitter conveys $w$ to the
receiver by releasing a pattern of molecules into the fluid medium. The receiver observes the arrivals of
the molecules, and from the pattern of arrivals, guesses that $w'$ was the message sent by the transmitter.
As in any communication channel, if $w = w'$, the transmission is successful; if $w \neq w'$, an error is made.

For the remainder of the paper, we make three modeling assumptions about the Brownian motions and 
physical properties of the molecules: first, that they are Markovian (i.e., given the molecule's position and 
physical state in the present, its motion in the future is conditionally independent of its motion in the past); 
second, that molecules do not react or otherwise change in transit; and third, that they are statistically
independent for different molecules.

Fig. 2. Transmitter, channel, and receiver, with associated quantities: the transmission $x = (m, \tau)$, the propagation $r_x(t) = (m, B(t))$, and the reception $y = (m, s, \tau')$.

As depicted in Figure 2, a particular molecule undergoes three processes: transmission, propagation,
and reception. Here we define these processes in full generality; our later assumptions and results will 
allow us to simplify some of these expressions. 

Letting $\mathcal{M}$ represent the set of molecules that are available within the system, a transmission $x$ is
defined as the pair

$$x = (m, \tau) \in \mathcal{M} \times \mathbb{R}, \qquad (1)$$

representing the release of a molecule of type $m \in \mathcal{M}$ at time $\tau \in \mathbb{R}$.

Letting $\mathcal{B}$ represent the set of possible Brownian motions $B(t)$, where $B(t)$ is the position of the
Brownian motion as a function of time, the propagation of the transmission $x$ is defined as the pair

$$r_x(t) = (m, B(t)) \in \mathcal{M} \times \mathcal{B}, \qquad (2)$$

representing the motion $B(t) \in \mathcal{B}$ of a molecule of type $m$. The propagation $r_x(t)$ is related to the
transmission $x$ through the initial conditions

$$r_x(\tau) = (m, B(\tau)) = (m, 0), \qquad (3)$$

since $x = (m, \tau)$ is transmitted from the origin at time $\tau$.

Let $\mathcal{S}$ represent the set of relevant physical states of a molecule (e.g., velocity) that the receiver is
capable of measuring. A reception $y$ is defined as the triple

$$y = (m, s, \tau') \in \mathcal{M} \times \mathcal{S} \times \mathbb{R}, \qquad (4)$$

representing the observation at the receiver of molecule $m \in \mathcal{M}$, with physical state $s \in \mathcal{S}$, at time $\tau' \in \mathbb{R}$.
(The need to measure physical states at the receiver is a technical requirement so that the Brownian motion
is Markovian under general models of motion.) Recalling our assumption that the receiver only interacts
with molecules as they cross the boundary, for each propagation $r_x(t)$, let $\mathcal{T}_{r_x(t)}$ represent the set of time
instants such that

$$\mathcal{T}_{r_x(t)} = \{\tau'_1, \tau'_2, \ldots\} = \{\tau' : B(\tau') = d\}, \qquad (5)$$

that is, the set of time instants at which the molecule is located at the boundary. Let $\mathcal{Y}_{r_x(t)}$ represent the
set of receptions corresponding to the propagation $r_x(t)$. Then

$$\mathcal{Y}_{r_x(t)} = \{y_0, y_1, \ldots\} = \{y : \tau' \in \mathcal{T}_{r_x(t)},\ y = (m, s, \tau')\} \qquad (6)$$

is the set of receptions corresponding to $\mathcal{T}_{r_x(t)}$. Further, $\mathcal{Y}_{r_x(t)}$ is the set of observations corresponding to
the transmission $x$.

In the sequel, we will represent the sequences of transmissions and receptions of all molecules as vectors,
writing $\mathbf{x} = [x_1, x_2, \ldots]$ and $\mathbf{y} = [y_1, y_2, \ldots]$ as vectors of transmissions and receptions, respectively.

B. Ideal model for transmitter and receiver 

In the previous section, we assumed that the molecules within the channel obeyed three assumptions, 
which we will justify later as physically realistic. However, to model the transmitter and receiver, we take 
a different approach: we use a mathematically convenient model, and show that this model is ideal in the 
sense that relaxing these assumptions leads to systems with lower mutual information. As a result, we 
obtain a model that is simple, that abstracts away the details of chemical processing at the terminals, and 
that is a valid upper bound on performance for any implementation of transmitter and receiver. 

We define three properties of a general ideal model for molecular communication, as follows: 

1) No control error at the transmitter. At the transmitter, we assume that the relevant properties of 
every transmitted molecule can be controlled exactly. 

2) No measurement error at the receiver. At the receiver, we assume that the relevant properties of 
every received molecule can be measured exactly. 

3) The receiver absorbs arriving molecules. At the first passage time of each molecule, the receiver 
removes the molecule from the system. 

It is obvious that the first two properties are ideal, and in the following result we prove that the
third property is ideal. Define this property formally as follows: partition the vector of receptions into
$[\mathbf{y}^{(1)}, \mathbf{y}^{(2)}]$, where $\mathbf{y}^{(1)}$ contains the receptions corresponding to the first passage times at the receiver, and
$\mathbf{y}^{(2)}$ contains all other receptions. That is, for each element $y_i = (m_i, s_i, \tau'_i)$ of $\mathbf{y}^{(1)}$, there is a propagation
$r_x(t)$ such that $\tau'_i = \min \mathcal{T}_{r_x(t)}$, and vice versa. Then $\mathbf{y} := \mathbf{y}^{(1)}$, and the receiver discards $\mathbf{y}^{(2)}$. (Equivalently,
each propagation $r_x(t)$ is terminated after its first passage time, so $\mathbf{y}^{(2)}$ is always empty.) Then we have
the following:

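Theorem 1: Let $I(\mathbf{X}; \mathbf{Y})$ represent the mutual information of a molecular communication system under
the ideal model. Then $I(\mathbf{X}; \mathbf{Y}) = I(\mathbf{X}; \mathbf{Y}^{(1)}) = I(\mathbf{X}; \mathbf{Y}^{(1)}, \mathbf{Y}^{(2)})$. Further, letting $\mathbf{y}'$ represent $[\mathbf{y}^{(1)}, \mathbf{y}^{(2)}]$,
sorted in order of arrival time (i.e., without prior knowledge of which arrivals are first arrivals), then
$I(\mathbf{X}; \mathbf{Y}) \geq I(\mathbf{X}; \mathbf{Y}')$.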
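Proof: We can first write

$$\begin{aligned}
I(\mathbf{X}; \mathbf{Y}^{(1)}, \mathbf{Y}^{(2)}) &= H(\mathbf{Y}^{(2)} \mid \mathbf{Y}^{(1)}) + H(\mathbf{Y}^{(1)}) + H(\mathbf{X}) - H(\mathbf{Y}^{(2)} \mid \mathbf{Y}^{(1)}, \mathbf{X}) - H(\mathbf{Y}^{(1)}, \mathbf{X}) \qquad (7) \\
&= H(\mathbf{Y}^{(2)} \mid \mathbf{Y}^{(1)}) + H(\mathbf{Y}^{(1)}) + H(\mathbf{X}) - H(\mathbf{Y}^{(2)} \mid \mathbf{Y}^{(1)}) - H(\mathbf{Y}^{(1)}, \mathbf{X}) \qquad (8) \\
&= I(\mathbf{X}; \mathbf{Y}^{(1)}), \qquad (9)
\end{aligned}$$

where the second equality follows since the Brownian motion is Markovian, by assumption. However,
since $\mathbf{y}^{(1)} = \mathbf{y}$, we have that

$$I(\mathbf{X}; \mathbf{Y}) = I(\mathbf{X}; \mathbf{Y}^{(1)}) = I(\mathbf{X}; \mathbf{Y}^{(1)}, \mathbf{Y}^{(2)}). \qquad (10)$$

Since $\mathbf{y}'$ is formed by processing $[\mathbf{y}^{(1)}, \mathbf{y}^{(2)}]$, $I(\mathbf{X}; \mathbf{Y}^{(1)}, \mathbf{Y}^{(2)}) \geq I(\mathbf{X}; \mathbf{Y}')$ follows from the data processing
inequality, and so the theorem follows. ■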

Intuitively, the theorem states that all the information in the molecules is in the first arrivals $\mathbf{y}^{(1)}$. If
the receiver knows which are the first arrivals (e.g., by removing a molecule after its first arrival, or by 
somehow "tagging" the molecule as having been observed already), it can safely ignore any subsequent 
arrivals of the same molecule. Meanwhile, taking away the receiver's knowledge of which are the first 
arrivals can only hurt the mutual information. 

We give the following example of a system that violates one of the ideal model assumptions, but that 
is nonetheless practically important, and will be used in subsequent sections: 

Example 1 (Counting detector): In the counting detector, time is partitioned into segments of length
$T$. The detector forms a sequence $\mathbf{c} = [c_1, c_2, \ldots]$ of integers, where $c_j$ represents the count of arriving
molecules on the interval $[jT, (j+1)T)$. (If there is more than one type of molecule in the system, $c_j$
represents the counts of each type of molecule.) More formally,

$$c_j = \left| \{ y_i : y_i = (m_i, s_i, \tau'_i),\ jT \leq \tau'_i < (j+1)T \} \right|. \qquad (11)$$

As in the ideal case, we assume that the transmitter is error-free, and that the receiver absorbs arriving 
molecules. However, the measurements taken by the receiver contain quantization error, since the arrival 
times are quantized to the intervals [jT, (j + 1)T). Thus, it is clear that the counting detector has lower 
mutual information than the ideal model. (End of example.) 
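
As an illustration of Example 1, the following Python sketch bins a list of arrival times into counts. It is a minimal sketch and not part of the original model: the function name `counting_detector`, the parameter `num_bins`, the zero-based bin index, and the sample arrival times are all assumptions made for illustration, and only a single molecule type is considered.

```python
import math

def counting_detector(arrival_times, T, num_bins):
    """Count arrivals in each interval [j*T, (j+1)*T) for j = 0..num_bins-1.

    Hypothetical helper: arrivals outside the observed bins are discarded.
    """
    counts = [0] * num_bins
    for t in arrival_times:
        j = math.floor(t / T)          # index of the interval containing t
        if 0 <= j < num_bins:
            counts[j] += 1
    return counts

# Illustrative arrival times (assumed values, in the same units as T).
arrivals = [0.2, 0.9, 1.0, 2.7, 2.9, 5.5]
print(counting_detector(arrivals, T=1.0, num_bins=4))  # -> [2, 1, 2, 0]
```

The exact arrival times cannot be recovered from the counts, which is the quantization error referred to in the example.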

C. Statistical model of the channel 

Consider a system employing the ideal model from Section II-B, and suppose a single molecule is
transmitted. Then if $x = (m, \tau)$, the molecule will arrive at the receiver as

$$y = (m, s, \tau') = (m, s, \tau + n_t), \qquad (12)$$

where $n_t$ is the first passage time at the receiver for the Brownian motion. Further, from Theorem 1,
measurements taken at the first passage time contain all the relevant information that the receiver needs
about the molecule.

Assuming that information is carried in the release time $\tau$, and disregarding the state $s$, the first passage
time $n_t$ may be viewed as additive noise. In particular, suppose $\mathbf{x} \in (\mathcal{M} \times \mathbb{R})^n$ is a vector of transmissions,
and suppose $\mathbf{y} \in (\mathcal{M} \times \mathcal{S} \times \mathbb{R})^n$ is the corresponding vector of receptions, considering only the first
passage times at the receiver. For each element $x_i = (m_i, \tau_i)$ of $\mathbf{x}$, there is a corresponding reception
$y_j = (m_i, s_i, \tau_i + n_{t,i})$, with corresponding first passage time $n_{t,i}$, of $\mathbf{y}$. However, it is not necessarily
true that $i = j$, as the random delays $n_{t,i}$ may cause the molecules to arrive out of order, and the receiver
does not generally know the order of transmission. Letting $\mathbf{u} = [u_1, u_2, \ldots]$, where $u_i = (m_i, s_i, \tau_i + n_{t,i})$,
we generally observe

$$\mathbf{y} = \mathrm{sort}_{\tau'}(\mathbf{u}), \qquad (13)$$

where the function $\mathrm{sort}_{\tau'} : (\mathcal{M} \times \mathcal{S} \times \mathbb{R})^n \to (\mathcal{M} \times \mathcal{S} \times \mathbb{R})^n$ sorts a vector of receptions in increasing
order of arrival time. As a result, $\mathbf{y}$ are the order statistics of $\mathbf{u}$, sorted with respect to $\tau'$.
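
The additive-delay-then-sort structure of (13) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `reorder_channel` and `placeholder_delay` are hypothetical, a single molecule type is assumed, the state $s$ is omitted, and the exponential delay is a stand-in chosen for simplicity rather than the first passage time law of any Brownian motion.

```python
import random

def reorder_channel(release_times, delay_sampler, rng):
    """Return (sorted arrival times, hidden transmit index of each arrival)."""
    # u_i = tau_i + n_{t,i}: each molecule receives its own independent delay.
    u = [(tau + delay_sampler(rng), i) for i, tau in enumerate(release_times)]
    u.sort()  # the receiver observes arrivals only in order of arrival time
    arrivals = [t for t, _ in u]
    hidden_order = [i for _, i in u]  # not available to the receiver
    return arrivals, hidden_order

def placeholder_delay(rng):
    # ASSUMPTION: unit-rate exponential delay, used purely as a stand-in.
    return rng.expovariate(1.0)

rng = random.Random(0)
release_times = [0.0, 0.5, 1.0, 1.5]
arrivals, hidden_order = reorder_channel(release_times, placeholder_delay, rng)
print(arrivals)
print(hidden_order)
```

Whenever `hidden_order` differs from `[0, 1, 2, 3]`, the molecules have arrived out of order; the receiver sees only `arrivals`, which is why the input-output density must account for every possible assignment of arrivals to transmissions.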

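If $\mathbf{w}$ is an $n$-dimensional vector random variable with independent and non-identically-distributed
elements, and $\mathbf{z}$ are the order statistics of $\mathbf{w}$, then the Bapat-Beg theorem [18, Theorem 4.1] states
that

$$\Pr(Z_1 < z_1, Z_2 < z_2, \ldots, Z_n < z_n) = \mathrm{per}\left( \begin{bmatrix}
F_{W_1}(z_1) & F_{W_1}(z_2) - F_{W_1}(z_1) & \cdots & F_{W_1}(z_n) - F_{W_1}(z_{n-1}) \\
F_{W_2}(z_1) & F_{W_2}(z_2) - F_{W_2}(z_1) & \cdots & F_{W_2}(z_n) - F_{W_2}(z_{n-1}) \\
\vdots & \vdots & \ddots & \vdots \\
F_{W_n}(z_1) & F_{W_n}(z_2) - F_{W_n}(z_1) & \cdots & F_{W_n}(z_n) - F_{W_n}(z_{n-1})
\end{bmatrix} \right), \qquad (14)$$

where $F_{W_i}(\cdot)$ is the cumulative distribution function (CDF) for the $i$th element of $\mathbf{w}$, and where $\mathrm{per}(\cdot)$
represents the permanent of a matrix [19]. For an $n \times n$ matrix $M = [m_{i,j}]$, $\mathrm{per}(M)$ is given by

$$\mathrm{per}(M) = \sum_{\pi \in \mathcal{V}_n} \prod_{i=1}^{n} m_{i,\pi(i)}, \qquad (15)$$

where $\mathcal{V}_n$ is the set of all permutations of $\{1, 2, \ldots, n\}$. For our system, we show in Appendix A that the
probability density function (PDF) $f_{\mathbf{Y}|\mathbf{X}}(\mathbf{y} \mid \mathbf{x})$ is given by

$$f_{\mathbf{Y}|\mathbf{X}}(\mathbf{y} \mid \mathbf{x}) = \mathrm{per}\left( \begin{bmatrix}
f_{Y_1|X_1}(y_1 \mid x_1) & f_{Y_2|X_1}(y_2 \mid x_1) & \cdots & f_{Y_n|X_1}(y_n \mid x_1) \\
f_{Y_1|X_2}(y_1 \mid x_2) & f_{Y_2|X_2}(y_2 \mid x_2) & \cdots & f_{Y_n|X_2}(y_n \mid x_2) \\
\vdots & \vdots & \ddots & \vdots \\
f_{Y_1|X_n}(y_1 \mid x_n) & f_{Y_2|X_n}(y_2 \mid x_n) & \cdots & f_{Y_n|X_n}(y_n \mid x_n)
\end{bmatrix} \right), \qquad (16)$$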
where $f_{Y_i|X_j}(y_i \mid x_j)$ represents the probability that arrival $y_i$ corresponds to the transmission $x_j$. If $y_i =
(m_i, s_i, \tau'_i)$ and $x_j = (m_j, \tau_j)$, then we have that

$$\begin{aligned}
f_{Y_i|X_j}(y_i \mid x_j) &= f_{M_i|M_j}(m_i \mid m_j)\, f_{S_i, T'_i|T_j}(s_i, \tau'_i \mid \tau_j) \\
&= \Delta(m_j, m_i)\, f_{S_i, T'_i|T_j}(s_i, \tau'_i \mid \tau_j), \qquad (17)
\end{aligned}$$

where $\Delta(m_j, m_i)$ represents the Kronecker delta function, which appears since (by assumption) molecules
do not change in transit. From (16) and (17), $f_{S_i, T'_i|T_j}(s_i, \tau'_i \mid \tau_j)$ is sufficient to completely characterize the
system.

In spite of the superficial similarity between the form of (15) and the calculation of the determinant,
computing the permanent is a member of a class of problems, known as #P-complete¹, which are believed to be
intractable [20]. The fastest known algorithm for calculating the permanent of an $n \times n$ matrix, given
in [21], has $\Theta(n 2^n)$ complexity, while most bounds and approximations have either low accuracy or
relatively high complexity (e.g., [22]). These facts significantly complicate the calculation of
the information rate of this channel. Although exact calculation of the information rate appears to be
difficult for any large system, in later sections we will give approaches for bounding the information rate.
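
To make the cost of evaluating (15) concrete, the following Python sketch computes the permanent in two ways: directly from the sum over permutations, and with Ryser's inclusion-exclusion formula. It is offered as an illustration only; the function names are hypothetical, and this plain form of Ryser's formula needs on the order of $n^2 2^n$ operations, so it is not claimed to be the algorithm of [21].

```python
from itertools import combinations, permutations
from math import prod

def permanent_bruteforce(M):
    """Sum over all n! permutations, exactly as in the definition of per(M)."""
    n = len(M)
    return sum(prod(M[i][p[i]] for i in range(n)) for p in permutations(range(n)))

def permanent_ryser(M):
    """Ryser's formula: (-1)^n * sum over nonempty column subsets S of
    (-1)^|S| * prod_i (sum_{j in S} M[i][j])."""
    n = len(M)
    total = 0.0
    for k in range(1, n + 1):
        for cols in combinations(range(n), k):
            row_product = prod(sum(M[i][j] for j in cols) for i in range(n))
            total += (-1) ** k * row_product
    return (-1) ** n * total

A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
print(permanent_bruteforce(A), permanent_ryser(A))
```

Both routines grow exponentially or worse in $n$, which is consistent with the intractability noted above for systems with many molecules.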

D. Separation into parallel channels for distinguishable molecules 

We now give a useful result that simplifies our analysis in the remainder of the paper: namely, that 
distinguishable molecules can be separated into independent parallel channels. 

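For each $i \in \mathcal{M}$, let $\mathbf{x}^{(i)}$ and $\mathbf{y}^{(i)}$ represent the vectors of transmissions and receptions, respectively,
corresponding to molecule $i$. Then we have the following:

Theorem 2: $f_{\mathbf{Y}|\mathbf{X}}(\mathbf{y} \mid \mathbf{x}) = \prod_{i \in \mathcal{M}} f_{\mathbf{Y}^{(i)}|\mathbf{X}^{(i)}}(\mathbf{y}^{(i)} \mid \mathbf{x}^{(i)})$.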

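Proof: Let $\mathcal{M} = \{m_1, m_2, \ldots, m_{|\mathcal{M}|}\}$. Suppose $\mathbf{x}$ and $\mathbf{y}$ are rearranged so that

$$\mathbf{x} = \left[\mathbf{x}^{(m_1)}, \mathbf{x}^{(m_2)}, \ldots, \mathbf{x}^{(m_{|\mathcal{M}|})}\right], \qquad (18)$$

and

$$\mathbf{y} = \left[\mathbf{y}^{(m_1)}, \mathbf{y}^{(m_2)}, \ldots, \mathbf{y}^{(m_{|\mathcal{M}|})}\right]. \qquad (19)$$

Since row and column permutations do not affect the permanent, this rearrangement does not affect
$f_{\mathbf{Y}|\mathbf{X}}(\mathbf{y} \mid \mathbf{x})$. Let $\mathbf{H} = [h_{i,j}]$, where $h_{i,j} = f_{Y_i|X_j}(y_i \mid x_j)$; then from (16), $f_{\mathbf{Y}|\mathbf{X}}(\mathbf{y} \mid \mathbf{x}) = \mathrm{per}(\mathbf{H})$. Furthermore,
for each $m_a, m_b \in \mathcal{M}$, let $\mathbf{H}^{(m_a, m_b)} = [h_{i,j}^{(m_a, m_b)}]$, where $h_{i,j}^{(m_a, m_b)} = f_{Y_i|X_j}(y_i^{(m_a)} \mid x_j^{(m_b)})$. Then, given the
rearrangement in (18)-(19),

$$\begin{aligned}
\mathbf{H} &= \begin{bmatrix}
\mathbf{H}^{(m_1, m_1)} & \mathbf{H}^{(m_1, m_2)} & \cdots & \mathbf{H}^{(m_1, m_{|\mathcal{M}|})} \\
\mathbf{H}^{(m_2, m_1)} & \mathbf{H}^{(m_2, m_2)} & \cdots & \mathbf{H}^{(m_2, m_{|\mathcal{M}|})} \\
\vdots & \vdots & \ddots & \vdots \\
\mathbf{H}^{(m_{|\mathcal{M}|}, m_1)} & \mathbf{H}^{(m_{|\mathcal{M}|}, m_2)} & \cdots & \mathbf{H}^{(m_{|\mathcal{M}|}, m_{|\mathcal{M}|})}
\end{bmatrix} \qquad (20) \\
&= \begin{bmatrix}
\mathbf{H}^{(m_1, m_1)} & \mathbf{0} & \cdots & \mathbf{0} \\
\mathbf{0} & \mathbf{H}^{(m_2, m_2)} & \cdots & \mathbf{0} \\
\vdots & \vdots & \ddots & \vdots \\
\mathbf{0} & \mathbf{0} & \cdots & \mathbf{H}^{(m_{|\mathcal{M}|}, m_{|\mathcal{M}|})}
\end{bmatrix}, \qquad (21)
\end{aligned}$$

where $\mathbf{0}$ represents an all-zero matrix of the appropriate dimension; the all-zero matrices in the second
equality follow from the delta function in (17). Since $\mathbf{H}$ is a block-diagonal matrix, a property of the
permanent is that [23]

$$\mathrm{per}(\mathbf{H}) = \prod_{i \in \mathcal{M}} \mathrm{per}\left(\mathbf{H}^{(i,i)}\right), \qquad (22)$$

and the theorem straightforwardly follows. ■

¹ #P-complete is pronounced "sharp-P complete".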
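Example 2: Suppose that every molecule is distinguishable, that is, there is exactly one element each
in $\mathbf{x}^{(m_1)}, \mathbf{x}^{(m_2)}, \ldots, \mathbf{x}^{(m_{|\mathcal{M}|})}$ and $\mathbf{y}^{(m_1)}, \mathbf{y}^{(m_2)}, \ldots, \mathbf{y}^{(m_{|\mathcal{M}|})}$. Then (21) becomes

$$\mathbf{H} = \begin{bmatrix}
f_{Y|X}(y^{(m_1)} \mid x^{(m_1)}) & 0 & \cdots & 0 \\
0 & f_{Y|X}(y^{(m_2)} \mid x^{(m_2)}) & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & f_{Y|X}(y^{(m_{|\mathcal{M}|})} \mid x^{(m_{|\mathcal{M}|})})
\end{bmatrix}, \qquad (23)$$

dropping the boldface notation from $x^{(m_i)}$ and $y^{(m_i)}$, as they are scalars in this example. Since (23) is a
diagonal matrix, it can be easily verified that

$$f_{\mathbf{Y}|\mathbf{X}}(\mathbf{y} \mid \mathbf{x}) = \mathrm{per}(\mathbf{H}) = \prod_{i=1}^{|\mathcal{M}|} f_{Y|X}(y^{(m_i)} \mid x^{(m_i)}), \qquad (24)$$

which is equivalent to the input-output PDF of any independent additive noise channel.

(End of example.)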

As a result of Theorem 2, for any system where $|\mathcal{M}| > 1$, we can separate the problem into independent
and identical parallel channels corresponding to each type of molecule in $\mathcal{M}$. Thus, the general information
rate problem reduces to the problem of finding information rates when $|\mathcal{M}| = 1$, i.e., when all the
molecules in the system are indistinguishable.
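
The factorization in Theorem 2 can be checked numerically on a small case. The Python sketch below builds the matrix of pairwise likelihoods with a Kronecker delta on the molecule type, following the structure of (17), and compares its permanent with the product of the per-type permanents. Everything specific here is an assumption for illustration: the type labels, the release and arrival times, the helper names, and the exponential delay density, which stands in for the true first passage time density; the physical state $s$ is omitted.

```python
from itertools import permutations
from math import exp, isclose, prod

def permanent(M):
    """Brute-force permanent; adequate for the tiny matrices used here."""
    n = len(M)
    return sum(prod(M[i][p[i]] for i in range(n)) for p in permutations(range(n)))

def delay_density(t):
    # ASSUMPTION: unit-rate exponential density as a placeholder delay law.
    return exp(-t) if t > 0 else 0.0

def likelihood_matrix(x, y):
    """h[i][j] = Delta(m_j, m_i) * f(tau'_i - tau_j) for reception i, transmission j."""
    return [[(1.0 if m_y == m_x else 0.0) * delay_density(t_y - t_x)
             for (m_x, t_x) in x]
            for (m_y, t_y) in y]

# Hypothetical transmissions (type, release time) and receptions (type, arrival time).
x = [("A", 0.0), ("B", 0.1), ("A", 0.4), ("B", 0.6)]
y = [("B", 0.9), ("A", 1.1), ("A", 1.3), ("B", 2.0)]

joint = permanent(likelihood_matrix(x, y))
per_type = prod(
    permanent(likelihood_matrix([t for t in x if t[0] == m],
                                [r for r in y if r[0] == m]))
    for m in ("A", "B")
)
print(joint, per_type)
```

No explicit rearrangement of rows and columns is needed in the sketch, since the permanent is unaffected by such permutations.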
E. The Wiener-Ideal Model

If the Brownian motion is modeled by the Wiener process [17], then there are no relevant physical
states (i.e., $\mathcal{S} = \emptyset$), and it can be shown that

$$f_{N_t}(n_t) = \begin{cases} \dfrac{d}{\sqrt{2 \pi \sigma^2 n_t^3}} \exp\left( -\dfrac{d^2}{2 \sigma^2 n_t} \right), & n_t > 0, \\ 0, & n_t \leq 0. \end{cases} \qquad (25)$$

Notice from (25) that $f_{N_t}(n_t)$ has a very long tail, which decays with $n_t^{-3/2}$; neither the mean nor the
variance of this distribution is finite. The two parameters of the distribution are $\sigma^2$, the intensity of
the Brownian motion; and $d$, the distance from the transmitter to the receiver. There exist other models
for Brownian motion, such as the Ornstein-Uhlenbeck process, where the only relevant physical state is
velocity (i.e., $\mathcal{S} = \mathbb{R}$); but in that case (and in most cases other than the Wiener process), $f_{N_t}(n_t)$ cannot
be expressed in closed form.

Suppose the ideal model from Section II-B is used. If the Brownian motion is modeled by the Wiener
process, we can simplify (17) to

$$f_{Y_i|X_j}(y_i \mid x_j) = \Delta(m_j, m_i)\, f_{N_t}(\tau'_i - \tau_j). \qquad (26)$$

Further, bearing in mind Theorem 2, for each $m \in \mathcal{M}$, we can disregard the $\Delta(m_j, m_i)$ term and assume
that $m$ is the only type of molecule in the channel. We can now write

$$f_{Y_i|X_j}(y_i \mid x_j) = f_{N_t}(\tau'_i - \tau_j), \qquad (27)$$

so that the system is fully characterized by the first passage time distribution. Thus, the transmissions are
all of the form $x = (m, \tau)$, and the receptions are all of the form $y = (m, \ , \tau')$ (noting the blank in place
of $s$, since $\mathcal{S} = \emptyset$); the only task of the transmitter is to set the departure time $\tau$, and the only task of the
receiver is to measure the first passage time $\tau'$. We call this straightforward model the Wiener-Ideal (WI)
model.

We now consider whether our modeling framework in general, and the WI model in particular, are
physically realistic. We began with three fundamental assumptions: that Brownian motion is Markovian,
that the motions are statistically independent, and that molecules do not change in transit. The first two
assumptions are common in the diffusion literature (e.g., see [17]). The third assumption depends on the
type of molecule in use, but there certainly exist molecules that are stable over the relatively short time
scales we consider. Furthermore, it is appropriate to assume that the motions are statistically described by
the Wiener process, so long as the Brownian motion is nearly free of friction [24]. If friction is significant,
this assumption can be relaxed by substituting the first arrival time distribution for the Wiener process with
the appropriate distribution; none of our subsequent methods depend specifically on the Wiener process.
Finally, we have the ideal modeling assumptions for the transmitter and receiver, which are less realistic,
but which we proposed as a way of deliberately trading off realism in order to obtain abstraction. Further,
as we saw in Theorem 1, these modeling assumptions are meaningful in an information-theoretic sense.
As a result, we may conclude that the WI model is physically realistic, and for the remainder of the paper,
we will deal exclusively with this model.
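
For readers who wish to experiment with the WI model, the following Python sketch evaluates and samples the first passage time of a driftless Wiener process. It assumes the standard result that this first passage time follows a Lévy distribution with scale $d^2/\sigma^2$, so that a sample can be drawn as $d^2/(\sigma^2 Z^2)$ with $Z$ a standard normal variable; the function names and the parameter values $d = 1$, $\sigma^2 = 1$ are hypothetical choices for illustration.

```python
import math
import random

def fpt_pdf(n_t, d, sigma2):
    """Assumed first passage time density of a Wiener process reaching distance d."""
    if n_t <= 0:
        return 0.0
    return (d / math.sqrt(2.0 * math.pi * sigma2 * n_t ** 3)
            * math.exp(-d * d / (2.0 * sigma2 * n_t)))

def fpt_cdf(n_t, d, sigma2):
    """Matching CDF, from the reflection principle: erfc(d / sqrt(2 sigma^2 n_t))."""
    if n_t <= 0:
        return 0.0
    return math.erfc(d / math.sqrt(2.0 * sigma2 * n_t))

def sample_fpt(d, sigma2, rng):
    """Draw one first passage time as d^2 / (sigma^2 Z^2), with Z standard normal."""
    z = rng.gauss(0.0, 1.0)
    while z == 0.0:              # guard against division by zero
        z = rng.gauss(0.0, 1.0)
    return d * d / (sigma2 * z * z)

rng = random.Random(1)
samples = [sample_fpt(1.0, 1.0, rng) for _ in range(20000)]
empirical = sum(s <= 1.0 for s in samples) / len(samples)
print(empirical, fpt_cdf(1.0, 1.0, 1.0))
```

Because the distribution has no finite mean, sample averages of the delays do not settle down as more samples are drawn; quantiles such as the fraction of arrivals before a given time are the more stable summaries.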

III. Some simplified views of information rate 

Under the WI model, we showed that $f_{\mathbf{Y}|\mathbf{X}}(\mathbf{y} \mid \mathbf{x})$ is intractable for a realistic number of molecules.
However, we can nonetheless give some simplified calculations of capacity, which will demonstrate the
importance of constraining the input distribution. Note that in this section, and for the remainder of the
paper, logarithms are assumed to be base 2 unless otherwise stated.

We assume the WI model, but the results are valid for any first passage time distribution where $f_{N_t}(n_t) > 0$
for any $n_t > 0$, and where $\lim_{n_t \to \infty} \int_0^{n_t} f_{N_t}(u)\, du = 1$ (i.e., the molecule arrives in finite time with
probability 1). Let $\mathbf{x}$ and $\mathbf{y}$ represent $n$-fold vectors of transmissions and receptions, let $f_{\mathbf{X}}(\mathbf{x})$ represent
the input distribution, and let $T_s$ represent the total observation time for the system. Then we can state
the following result.

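Theorem 3: For any $n$,

$$\lim_{T_s \to \infty} \max_{f_{\mathbf{X}}(\mathbf{x})} \frac{I(\mathbf{X}; \mathbf{Y})}{n} = \infty. \qquad (28)$$

Further, for any $T_s$,

$$\lim_{n \to \infty} \max_{f_{\mathbf{X}}(\mathbf{x})} \frac{I(\mathbf{X}; \mathbf{Y})}{T_s} = \infty. \qquad (29)$$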

Proof: In both cases, the statement is proved by finding special cases in which the information rate is
infinite.

To prove the first statement, suppose $T_s$ is partitioned into intervals of duration $\log T_s$, and suppose
$f_{\mathbf{X}}(\mathbf{x})$ is chosen so that all $n$ molecules are released at the beginning of a single interval. As $T_s \to \infty$,
$\log T_s \to \infty$, so the probability that the molecules arrive in the same interval approaches 1 (by assumption). Using
this method, $\log \log T_s$ bits can be transmitted without error. Since $\log \log T_s \to \infty$ as $T_s \to \infty$, then for
each $n$, $\lim_{T_s \to \infty} \max_{f_{\mathbf{X}}(\mathbf{x})} I(\mathbf{X}; \mathbf{Y})/n = \infty$.

To prove the second statement, we take the same strategy as in [15] for infinite-server queues, which
we restate here for completeness. Divide $T_s$ into intervals of size $v$, where $f_{\mathbf{X}}(\mathbf{x})$ is again chosen so
as to release all $n$ molecules at the beginning of a single interval. The receiver's strategy is to wait for the
first molecule to arrive, and decide that it was transmitted in the same interval in which it arrived; all
remaining molecules are ignored. Since $f_{N_t}(n_t) > 0$ for all $n_t > 0$ (under the WI model), as $n \to \infty$ the
probability of error in this strategy goes to zero. Thus, $\log(T_s/v)$ bits can be transmitted using this strategy,
and as $v \to 0$, $\log(T_s/v) \to \infty$. Thus, for any $T_s$, $\lim_{n \to \infty} \max_{f_{\mathbf{X}}(\mathbf{x})} I(\mathbf{X}; \mathbf{Y})/T_s = \infty$. ■
The strategies to achieve these two results (waiting infinitely long and releasing an infinite number of
molecules at once, respectively) are impractical because both time and molecules are costly resources
in molecular communication. Furthermore, physical limitations (such as saturation) may prevent these
strategies. However, Theorem 3 does give us some intuition concerning the behavior of our system in
practical settings: for example, we expect that the number of bits per molecule should be high when 
molecules are sparse. In the remainder of the paper, we attempt to find information rates per unit time, 
or per molecule, given appropriate constraints on the input distribution. 
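
The second strategy in the proof of Theorem 3 can be illustrated numerically. In the Python sketch below, the decision fails only if none of the $n$ molecules arrives within the interval of width $v$ in which they were released, so the error probability is $(1 - F(v))^n$, where $F$ is the first passage time CDF. The sketch assumes the Lévy form of the Wiener first passage time CDF, and the function names and the values of $d$, $\sigma^2$, $v$, and $T_s$ are hypothetical.

```python
import math

def fpt_cdf(t, d, sigma2):
    """Assumed CDF of the Wiener first passage time to distance d."""
    return math.erfc(d / math.sqrt(2.0 * sigma2 * t)) if t > 0 else 0.0

def error_probability(n, v, d, sigma2):
    # Error event: no molecule arrives within the release interval of width v.
    return (1.0 - fpt_cdf(v, d, sigma2)) ** n

def bits_conveyed(T_s, v):
    # One of T_s / v intervals is selected, so log2(T_s / v) bits are conveyed.
    return math.log2(T_s / v)

d, sigma2, T_s, v = 1.0, 1.0, 8.0, 0.5
for n in (10, 100, 1000):
    print(n, error_probability(n, v, d, sigma2), bits_conveyed(T_s, v))
```

For fixed $v$ the error probability decays geometrically in $n$, while shrinking $v$ raises the number of bits but requires more molecules to keep the error small, which is the resource tradeoff that motivates constraining the input distribution.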

IV. Achievable lower bounds on information rate 
A. Bounds from approximate distributions 

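We know that mutual information can be written

$$I(\mathcal{X}; \mathcal{Y}) = \lim_{n \to \infty} \frac{1}{n} E\left[ \log \frac{f_{\mathbf{Y}|\mathbf{X}}(\mathbf{y} \mid \mathbf{x})}{f_{\mathbf{Y}}(\mathbf{y})} \right], \qquad (30)$$

so long as the limit exists. Furthermore, it is easy to generate instances of $\mathbf{x}$ and $\mathbf{y}$, so any expectation of
these variables may be tractably obtained using Monte Carlo methods. Unfortunately, in (30), the function
$f_{\mathbf{Y}|\mathbf{X}}(\mathbf{y} \mid \mathbf{x}) / f_{\mathbf{Y}}(\mathbf{y})$ is itself intractable.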

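Instead, suppose that there exist tractable approximations $g(\mathbf{y} \mid \mathbf{x})$ and $g(\mathbf{y})$ for $f_{\mathbf{Y}|\mathbf{X}}(\mathbf{y} \mid \mathbf{x})$ and $f_{\mathbf{Y}}(\mathbf{y})$,
respectively, which have the following properties:

1) $\int_{\mathbf{y}} g(\mathbf{y} \mid \mathbf{x}) = 1$ and $g(\mathbf{y} \mid \mathbf{x}) \geq 0$ for all $\mathbf{x}, \mathbf{y}$ (i.e., $g(\mathbf{y} \mid \mathbf{x})$ is a valid probability density function);
and

2) Given the true input distribution $f_{\mathbf{X}}(\mathbf{x})$, $g(\mathbf{y})$ is found by $g(\mathbf{y}) = \int_{\mathbf{x}} g(\mathbf{y} \mid \mathbf{x}) f_{\mathbf{X}}(\mathbf{x})$.

If the approximations $g(\mathbf{y} \mid \mathbf{x})$ and $g(\mathbf{y})$ are sufficiently good, a close approximation to $I(\mathbf{X}; \mathbf{Y})$ might
be found by substituting $g(\mathbf{y} \mid \mathbf{x}) / g(\mathbf{y})$ in place of $f_{\mathbf{Y}|\mathbf{X}}(\mathbf{y} \mid \mathbf{x}) / f_{\mathbf{Y}}(\mathbf{y})$ in (30) and using Monte Carlo
expectation.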

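In fact, the approximation is a lower bound on the true $I(\mathbf{X}; \mathbf{Y})$. The following result was proved in
[16] and elsewhere; we restate and prove it here:
Theorem 4: Given $g(\mathbf{y} \mid \mathbf{x})$ and $g(\mathbf{y})$ defined as above,

$$I(\mathbf{X}; \mathbf{Y}) \geq \lim_{n \to \infty} \frac{1}{n} E\left[ \log \frac{g(\mathbf{y} \mid \mathbf{x})}{g(\mathbf{y})} \right], \qquad (31)$$

where the expectation is taken with respect to the true distribution of $\mathbf{x}$ and $\mathbf{y}$.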
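Proof: Let $g(\mathbf{x} \mid \mathbf{y}) = g(\mathbf{y} \mid \mathbf{x}) f_{\mathbf{X}}(\mathbf{x}) / g(\mathbf{y})$. We can rewrite the expectation in (31) as

$$\begin{aligned}
E\left[ \log \frac{g(\mathbf{y} \mid \mathbf{x})}{g(\mathbf{y})} \right] &= E\left[ \log \frac{g(\mathbf{x} \mid \mathbf{y})}{f_{\mathbf{X}}(\mathbf{x})} \right] \\
&= H(\mathbf{X}) + E\left[ \log g(\mathbf{x} \mid \mathbf{y}) \right] \\
&= H(\mathbf{X}) - H(\mathbf{X} \mid \mathbf{Y}) - D\left( f_{\mathbf{X}|\mathbf{Y}}(\mathbf{x} \mid \mathbf{y}) \,\|\, g(\mathbf{x} \mid \mathbf{y}) \right) \\
&= I(\mathbf{X}; \mathbf{Y}) - D\left( f_{\mathbf{X}|\mathbf{Y}}(\mathbf{x} \mid \mathbf{y}) \,\|\, g(\mathbf{x} \mid \mathbf{y}) \right), \qquad (32)
\end{aligned}$$

where $D(f \,\|\, g)$ represents Kullback-Leibler (KL) divergence. The theorem immediately follows from
(32) and the nonnegativity of KL divergence. ■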

The bound in (31) has the interesting physical interpretation as an achievable rate for a decoder that
assumes that $g(\mathbf{y} \mid \mathbf{x})$ is the correct input-output distribution. As a result, the bound in (31) is an achievable
bound, in that we could (in principle) construct a device to achieve reliable communication at the rate
given by the bound.

B. The trapdoor channel and trivial approximations 

As we see from Theorem HI the tightness of the bound is governed by a term related to the Kullback- 
Leibler divergence between the approximation and the true distribution, so better approximations will lead 
to better bounds. Our challenge is to find good tractable approximations g(y|x), but as we point out in 
this section, seemingly good candidates for g(y|x) can lead to trivial bounds. 

We can write 



If there exist x and y such that g(y | x) = 0, while /x,y( x , y) > and g(y) > 0, we refer to g(y | x) as a 
trivial approximation. For any trivial approximation, it is easy to see that the expectation in (1331) returns 
a value of — oo. A sufficient condition to avoid a trivial approximation is to require g(y | x) > for all 
x and y, and we will require this condition to hold for all candidate approximations in the remainder of 
the paper. 

The reader may believe that the trapdoor (or "chemical") channel [12] is a promising candidate for 
g(y | x); this channel is described as follows. Let U represent a set of symbols, and using time index t, let 
Ut G U and v t G U represent inputs and outputs, respectively; and let the multiset S t = {si, s 2 , . . . , s\$\} G 
U n represent the channel state. At time t, u t is provided to the channel, and v t is selected uniformly at 
random from the multiset {u t } U S t . (The selection probability can be generalized to something non- 
uniform, but this does not change our subsequent analysis.) Finally, we set S t +± = ({u t } U S t )\v t . This 
is commonly likened to a bag of billiard balls, where S t represents the balls already in the bag, x t 



(|32|) and the properties of KL divergence. 




(33) 
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represents a ball dropped into the bag, and v t represents a ball removed from the bag. This process also 
has a superficial similarity to chemical diffusion, which has been noted by some authors [13]. 

Unfortunately, this channel model does not satisfy the sufficient condition we gave to avoid trivial
approximations. To see this, let $\mathbf{u} = [u_1, u_2, \ldots, u_{|\mathcal{S}|+1}]$ and $\mathbf{v} = [v_1, v_2, \ldots, v_{|\mathcal{S}|+1}]$ be length-$(|\mathcal{S}| + 1)$
sequences of trapdoor channel inputs and outputs, respectively. Say $u_1 = u_2 = \cdots = u_{|\mathcal{S}|+1}$, and
$v_1 = v_2 = \cdots = v_{|\mathcal{S}|+1}$, but $u_i \neq v_i$ for any $i = 1, 2, \ldots, |\mathcal{S}| + 1$; in the billiard ball analogy, $|\mathcal{S}| + 1$ blue
balls are dropped into the bag, and $|\mathcal{S}| + 1$ red balls are removed. Since the largest number of red balls
initially in the bag is $|\mathcal{S}|$, and no additional red balls are inserted, this scenario is clearly impossible, so
$f_{V|U}(\mathbf{v} \mid \mathbf{u}) = 0$.
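The impossibility argument can be checked numerically with a small simulation. The sketch below is illustrative only: the function names, the "red"/"blue" labels, and the number of trials are assumptions. It starts from the most favourable state (a bag full of red balls), drops in $|\mathcal{S}| + 1$ blue balls, and records the largest number of red balls ever drawn.

```python
import random


def trapdoor_step(state, u, rng):
    """One channel use: add input u to the bag, then remove one ball uniformly at random."""
    bag = state + [u]
    v = bag.pop(rng.randrange(len(bag)))
    return bag, v


def max_red_outputs(state_size, trials=1000, seed=0):
    """Largest number of red outputs seen when state_size + 1 blue balls are sent."""
    rng = random.Random(seed)
    worst = 0
    for _ in range(trials):
        state = ["red"] * state_size          # most favourable initial state
        reds = 0
        for _ in range(state_size + 1):       # drop |S| + 1 blue balls
            state, v = trapdoor_step(state, "blue", rng)
            reds += int(v == "red")
        worst = max(worst, reds)
    return worst


if __name__ == "__main__":
    for size in (1, 2, 3):
        print("state size", size, "-> most red outputs observed:", max_red_outputs(size))
```

No run can produce more red outputs than the state size, which is the event assigned zero probability in the text.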

There exist a few obvious mappings from u to x and from v to y: for example, a red ball may
represent a transmission or reception at the input or output, respectively; and a blue ball may represent no
transmission or reception. However, for this mapping, there is no way to structure the input distribution
so as to avoid a trivial approximation: an imbalance between transmissions and receptions, which exceeds
$|\mathcal{S}|$, is impossible in the trapdoor channel. Further, using the WI model, it is straightforward to show
for most input distributions that the expected number of molecules in transit is infinite, so it is generally
possible to find sequences x and y with $f_{X,Y}(\mathbf{x}, \mathbf{y}) > 0$ for the true distribution, but $g(\mathbf{y} \mid \mathbf{x}) = 0$ under
the finite-state trapdoor model. As a result, it is not easy to see how to avoid trivial approximations with
the trapdoor channel, and so we do not consider that model any further in this paper.

C. A Simple Approximate Model 

Bearing the previous section in mind, we need to find a model which provides a reasonably good, yet 
tractable, pair of approximations g(y | x) and g(y). Our goal in this section is to introduce a simple model 
that leads to useful lower bounds, which we generalize in the next section. 

The basis of our simple approximate model is the counting detector introduced in an earlier example. As in that example,
let $\mathbf{c} = [c_1, c_2, \ldots]$ represent the counts in each interval. For each transmission in $\mathbf{x} = [x_1, x_2, \ldots]$, let
$a_{i,j} = 1$ if the transmission $x_i$ arrives during the interval $[jT, (j+1)T)$, and $a_{i,j} = 0$ otherwise. Then the
number of arrivals can be written

$$c_j = \sum_{i=1}^{n} a_{i,j}. \tag{34}$$

Accordingly, we are actually approximating $f_{C|X}(\mathbf{c} \mid \mathbf{x})$, the conditional probability of count vector c given
transmissions x, with $g(\mathbf{c} \mid \mathbf{x})$. By Theorem 4, this approximation provides a lower bound on the mutual
information of the counting detector, while, as shown in the earlier counting detector example, the mutual
information of the counting detector provides a lower bound on the mutual information of the WI model.

TABLE I

Illustration of the accuracy of the Poisson approximation for background arrivals; $10^4$ trials.

k          0        1        2        3        4        >= 5
Actual     0.6921   0.2602   0.0423   0.0047   0.0007   0
Poisson    0.6965   0.2519   0.0456   0.0055   0.0005   3.8215 x 10^-5

For convenience, we constrain the input distribution so that transmissions only occur on the boundary
between intervals, i.e., only at times jT, for integer j. Partition x into subvectors such that

$$\mathbf{x} = [\mathbf{x}_1, \mathbf{x}_2, \ldots], \tag{35}$$

where $\mathbf{x}_j$ is the vector of transmissions that occur at the instant $\tau = jT$. We allow the vector $\mathbf{x}_j$ to be
empty, which corresponds to the event that there are no transmissions at time $\tau = jT$.
To achieve computational simplicity in the approximate model, we require that

$$g(\mathbf{c} \mid \mathbf{x}) = \prod_{j=1}^{n} g(c_j \mid \mathbf{x}_j), \tag{36}$$

so $c_j$ and $\mathbf{x}_i$ are assumed independent if $i \neq j$. This can be interpreted as follows: for molecules released
at time $\tau = jT$, at the beginning of the interval $[jT, (j+1)T)$, the receiver only attempts to detect the
transmissions $\mathbf{x}_j$ during that interval; if those molecules do not arrive during this interval, then they are
assumed to be "lost". Thus, in addition to the transmissions, there will also be spurious arrivals composed
of these "lost" molecules. These arrivals form part of the sum in (34), and it is known that the sum of
independent Bernoulli random variables, whether identically distributed or not, can be approximated with
the Poisson distribution [25]. Thus, the Poisson distribution can be used to model the additive "noise"
from the lost molecules. In Table I, we give the empirical distribution of the background arrivals, produced
after the transmission of $10^4$ molecules under the WI model, alongside a Poisson distribution with the
same mean. From the table, we see that the Poisson distribution provides a very good approximation for
this process.
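To illustrate why the Poisson distribution is a reasonable stand-in for a sum of independent, non-identical Bernoulli variables, the following sketch computes the exact distribution of such a sum and its total variation distance to a Poisson distribution with the same mean. The list of success probabilities is hypothetical and is not derived from the WI model; the comparison against the sum of squared probabilities uses Le Cam's inequality.

```python
import math


def poisson_binomial_pmf(probs):
    """Exact pmf of a sum of independent Bernoulli(p_i) variables, by convolution."""
    pmf = [1.0]
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)      # this variable is 0
            nxt[k + 1] += mass * p          # this variable is 1
        pmf = nxt
    return pmf


def poisson_pmf(k, lam):
    """Poisson probability of k events with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def total_variation_to_poisson(probs):
    """Total variation distance between the exact sum and a Poisson with equal mean."""
    lam = sum(probs)
    exact = poisson_binomial_pmf(probs)
    body = sum(abs(e - poisson_pmf(k, lam)) for k, e in enumerate(exact))
    # Poisson mass above the largest possible value of the Bernoulli sum.
    tail = max(1.0 - sum(poisson_pmf(k, lam) for k in range(len(exact))), 0.0)
    return 0.5 * (body + tail)


if __name__ == "__main__":
    probs = [0.05 / math.sqrt(j + 1) for j in range(40)]   # hypothetical values
    print("total variation distance:", total_variation_to_poisson(probs))
    print("Le Cam bound (sum p_i^2):", sum(p * p for p in probs))
```

The distance shrinks as the individual probabilities become small, which is the regime of late-arriving "lost" molecules described above.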

We now derive $g(c_j \mid \mathbf{x}_j)$. A molecule released at time $\tau$ arrives between time $\tau$ and $\tau + T$ with
probability $p_a$, where

$$p_a = \int_{n_t = 0}^{T} f_{N_t}(n_t). \tag{37}$$

Let the function $\eta(\mathbf{x}_j)$ represent the number of transmissions in $\mathbf{x}_j$. Furthermore, for any integer k, let

$$\phi(k; \lambda) = \begin{cases} \dfrac{\lambda^{k} e^{-\lambda}}{k!}, & k \geq 0, \\ 0, & k < 0 \end{cases} \tag{38}$$

represent the Poisson distribution with parameter $\lambda$. By assumption in $g(c_j \mid \mathbf{x}_j)$, $\eta(\mathbf{x}_j)$ molecules are
transmitted at time jT, and $c_j$ arrive on the interval $[jT, (j+1)T)$. Suppose that k of the $\eta(\mathbf{x}_j)$ transmissions
arrive, which are distributed Bernoulli with probability $p_a$; then $(c_j - k)$ "lost" molecules arrive, which
are distributed Poisson. Thus, we have

$$g(c_j \mid \mathbf{x}_j) = \sum_{k=0}^{\eta(\mathbf{x}_j)} \binom{\eta(\mathbf{x}_j)}{k} p_a^{k} (1 - p_a)^{\eta(\mathbf{x}_j) - k} \, \phi(c_j - k; \lambda). \tag{39}$$

One may then find $g(\mathbf{c})$ by marginalizing (36) over x, in accordance with the necessary properties of
$g(\mathbf{c})$. It is obvious that $g(\mathbf{c} \mid \mathbf{x})$ is a valid probability distribution. Further, from (38) and (39), $g(c_j \mid \mathbf{x}_j)$ is
nonzero as long as $p_a < 1$, which is true for any $T$: $0 < T < \infty$ and any $\lambda$: $0 < \lambda < \infty$. Thus, for the
given values of T and $\lambda$, this approximation $g(\mathbf{c} \mid \mathbf{x})$ satisfies the sufficient condition from Section IV-B.
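The per-interval approximation in (39) is a binomial distribution convolved with a Poisson distribution, and is straightforward to evaluate. The sketch below is one possible implementation; the function names and the parameter values in the usage section are hypothetical.

```python
import math


def poisson_pmf(k, lam):
    """Eq. (38): Poisson probability, defined as zero for negative k."""
    if k < 0:
        return 0.0
    return math.exp(-lam) * lam ** k / math.factorial(k)


def g_count(c, n_tx, p_a, lam):
    """Eq. (39): probability of counting c arrivals given n_tx transmissions.

    k of the n_tx transmitted molecules arrive in the interval (binomial with
    success probability p_a); the remaining c - k arrivals are Poisson "noise".
    """
    return sum(
        math.comb(n_tx, k) * p_a ** k * (1.0 - p_a) ** (n_tx - k) * poisson_pmf(c - k, lam)
        for k in range(n_tx + 1)
    )


if __name__ == "__main__":
    # Hypothetical parameters: one transmission, p_a = 0.5, lambda = 0.25.
    for c in range(5):
        print(c, g_count(c, n_tx=1, p_a=0.5, lam=0.25))
```

Because the Poisson factor is strictly positive for every nonnegative count when $\lambda > 0$, the resulting probabilities are never zero, which is the property needed to avoid a trivial approximation.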

D. Generalized Approximate Model 

We can generalize the technique in the previous section by observing the channel for the arrival of
a particular transmission over iT seconds, for some integer i > 1. As a molecule is transmitted, earlier
molecules may still be propagating towards the receiver, and the late arrivals of these propagating molecules
are somewhat analogous to inter-symbol interference. Detection is accomplished by assuming a Markov
relationship between transmissions and receptions.

Generalizing (37), for nonnegative integers k, let $p_{a,k}(\tau)$ represent the probability that a molecule arrives
on the interval $[kT, (k+1)T)$, given that it was released at time $\tau$. This probability is given by

$$p_{a,k}(\tau) = \int_{n_t = kT - \tau}^{(k+1)T - \tau} f_{N_t}(n_t). \tag{40}$$

We will also use the conditional probability $\tilde{p}_{a,k}(\tau)$ of a molecule arriving in the interval $[kT, (k+1)T)$,
given that it was released at time $\tau$, and that it did not arrive on the interval $[\tau, kT)$, given by

$$\tilde{p}_{a,k}(\tau) = \frac{p_{a,k}(\tau)}{1 - \int_{n_t = 0}^{kT - \tau} f_{N_t}(n_t)} = \frac{p_{a,k}(\tau)}{1 - \sum_{i=0}^{k-1} p_{a,i}(\tau)}. \tag{41}$$
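The probabilities in (40) and (41) reduce to differences of the arrival-time CDF. The following sketch evaluates them under an explicit assumption: the CDF is taken to be that of the first-passage time of a zero-drift Wiener process, $\mathrm{erfc}(d / \sqrt{2 \sigma^2 t})$. This closed form is not stated in this section; it is used here because it reproduces the value $p_{a,0}(0) = 0.5$ at T = 2.198 quoted in Section VI, and the density from (25) should be substituted if it differs. All function names are hypothetical.

```python
import math


def arrival_cdf(t, d=1.0, sigma2=1.0):
    """ASSUMED arrival-time CDF: first passage of a zero-drift Wiener process."""
    if t <= 0.0:
        return 0.0
    return math.erfc(d / math.sqrt(2.0 * sigma2 * t))


def p_arrive(k, T, tau=0.0, d=1.0, sigma2=1.0):
    """Eq. (40): probability of arrival in [kT, (k+1)T) given release at time tau."""
    upper = arrival_cdf((k + 1) * T - tau, d, sigma2)
    lower = arrival_cdf(k * T - tau, d, sigma2)
    return upper - lower


def p_arrive_conditional(k, T, tau=0.0, d=1.0, sigma2=1.0):
    """Eq. (41): same event, given that the molecule has not arrived before kT."""
    still_in_transit = 1.0 - arrival_cdf(k * T - tau, d, sigma2)
    return p_arrive(k, T, tau, d, sigma2) / still_in_transit


if __name__ == "__main__":
    T = 2.198
    for k in range(4):
        print(k, p_arrive(k, T), p_arrive_conditional(k, T))
```

The heavy tail of this law is visible in how slowly the unconditional probabilities decay with k.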
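The state of the channel at any time t = jT, written $\mathbf{s}_j$, consists of the transmissions x that were
transmitted prior to jT, and remain in transit during the interval $[jT, (j+1)T)$. Extending the function
$\eta(\cdot)$ so that $\eta(\mathbf{s}_j)$ represents the number of transmissions contained in the state $\mathbf{s}_j$, we can write

$$f_{C_j | S_j, S_{j+1}, X_j}(c_j \mid \mathbf{s}_j, \mathbf{s}_{j+1}, \mathbf{x}_j) = \begin{cases} 1, & c_j = \eta(\mathbf{s}_j) + \eta(\mathbf{x}_j) - \eta(\mathbf{s}_{j+1}), \\ 0, & \text{otherwise}, \end{cases} \tag{42}$$

since the number of arrived molecules in the jth interval is equal to the number already in transit (i.e.,
$\eta(\mathbf{s}_j)$) plus the number added (i.e., $\eta(\mathbf{x}_j)$), minus the number still in transit in the (j+1)th interval (i.e.,
$\eta(\mathbf{s}_{j+1})$): a sort of "Kirchhoff's law" of molecules. Furthermore, since the state sequence $\mathbf{s}_j, \mathbf{s}_{j+1}, \ldots$ only
encodes the molecules in transit along with their original transmission time, this sequence is clearly a
Markov chain. Refining the $\eta(\cdot)$ function, let $\eta_{\tau}(\mathbf{x}_j)$ represent the number of transmissions in the vector
$\mathbf{x}_j$ for which the release time is equal to $\tau$. Further, let $\mathcal{R}$ represent the set of allowed release times. We can
then write

$$f_{S_{j+1} | S_j, X_j}(\mathbf{s}_{j+1} \mid \mathbf{s}_j, \mathbf{x}_j) = \prod_{\tau \in \mathcal{R}} \binom{\eta_{\tau}(\mathbf{s}_j) + \eta_{\tau}(\mathbf{x}_j)}{\eta_{\tau}(\mathbf{s}_{j+1})} \tilde{p}_{a,j}(\tau)^{\eta_{\tau}(\mathbf{s}_j) + \eta_{\tau}(\mathbf{x}_j) - \eta_{\tau}(\mathbf{s}_{j+1})} \left( 1 - \tilde{p}_{a,j}(\tau) \right)^{\eta_{\tau}(\mathbf{s}_{j+1})}, \tag{43}$$

where $\tilde{p}_{a,j}(\tau)$ is the conditional arrival probability from (41), and defining $\binom{a}{b} = 0$ if b > a, and $\binom{a}{b} = 1$ if a = b = 0. Equation (43) is obtained since each molecule has
a certain probability of arrival in the jth interval that is dependent on its departure time, so arrivals of
molecules transmitted at the same time have the binomial distribution. As a result, letting $\mathbf{S} = [\mathbf{s}_0, \mathbf{s}_1, \ldots]$,
we can write

$$f_{C,S|X}(\mathbf{c}, \mathbf{S} \mid \mathbf{x}) = f_{S_0}(\mathbf{s}_0) \prod_{j=0}^{n} f_{C_j | S_j, S_{j+1}, X_j}(c_j \mid \mathbf{s}_j, \mathbf{s}_{j+1}, \mathbf{x}_j) \, f_{S_{j+1} | S_j, X_j}(\mathbf{s}_{j+1} \mid \mathbf{s}_j, \mathbf{x}_j), \tag{44}$$

where $f_{S_0}(\mathbf{s}_0)$ represents the distribution of the initial channel state.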
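The conservation relation in (42) can be checked with a short simulation of the counting detector. This is a sketch under stated assumptions: at most one molecule is released at the start of each interval, and propagation times are drawn from the first-passage law of a zero-drift Wiener process (sampled as $d^2 / (\sigma^2 Z^2)$ with Z standard normal), which is an assumed stand-in for the WI model's arrival-time density. Function names and parameter values are hypothetical.

```python
import random


def first_passage_time(rng, d=1.0, sigma2=1.0):
    """ASSUMED propagation time: d^2 / (sigma2 * Z^2) with Z ~ N(0, 1)."""
    z = rng.gauss(0.0, 1.0)
    while z == 0.0:                       # guard against division by zero
        z = rng.gauss(0.0, 1.0)
    return d * d / (sigma2 * z * z)


def simulate(n_intervals, T, p_x, seed=0):
    """Return (c_j, eta(s_j), eta(x_j), eta(s_{j+1})) for each interval j."""
    rng = random.Random(seed)
    releases = [j for j in range(n_intervals) if rng.random() < p_x]
    # Each molecule is stored as (release interval, absolute arrival time).
    molecules = [(j, j * T + first_passage_time(rng)) for j in releases]
    rows = []
    for j in range(n_intervals):
        start, end = j * T, (j + 1) * T
        in_state = sum(1 for r, a in molecules if r < j and a >= start)
        added = sum(1 for r, a in molecules if r == j)
        next_state = sum(1 for r, a in molecules if r <= j and a >= end)
        count = sum(1 for r, a in molecules if start <= a < end)
        rows.append((count, in_state, added, next_state))
    return rows


if __name__ == "__main__":
    for row in simulate(n_intervals=10, T=2.198, p_x=0.5, seed=1):
        print("c_j=%d  eta(s_j)=%d  eta(x_j)=%d  eta(s_j+1)=%d" % row)
```

Each printed row satisfies $c_j = \eta(\mathbf{s}_j) + \eta(\mathbf{x}_j) - \eta(\mathbf{s}_{j+1})$, and the state sizes can grow large because late arrivals are common under a heavy-tailed propagation time.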

The statistical model in (44) is not an approximation; it is the true probability of the output and
channel state of the counting detector, conditioned on channel input. However, since the state space is
generally enormous, there is no tractable way to recover $f_{C|X}(\mathbf{c} \mid \mathbf{x})$ from $f_{C,S|X}(\mathbf{c}, \mathbf{S} \mid \mathbf{x})$. This should
not be surprising, given our derivation of the exact model in Section III-C. However, bearing in mind our
approximate model from the previous section, we may constrain the complexity of the state space by
deleting molecules from the state space after a given amount of time. Similarly to the previous section,
these deleted molecules are considered "lost", and their eventual arrivals are assumed to form a Poisson
noise process at the receiver. Furthermore, by adjusting the amount of time that molecules are allowed
to remain in the state space, it is possible to trade the fidelity of the model against its computational
complexity (which scales with the size of the state space).

Once again, for convenience, we assume that transmissions only occur at times $\tau = kT$ for integer
values of k. However, we constrain the state space so that the state $\mathbf{s}_j$ may only contain transmissions
with release times more recent than $(j - i)T$ for some integer i. Then we can use

$$\mathcal{R} = \{jT, (j-1)T, (j-2)T, \ldots, (j-i+1)T\} \tag{45}$$

in (43). Meanwhile, $f_{C_j | S_j, S_{j+1}, X_j}(c_j \mid \mathbf{s}_j, \mathbf{s}_{j+1}, \mathbf{x}_j)$ must be adjusted to account for the arrivals of molecules
that are about to be deleted from the channel state (since these are no longer accounted for in the transition
from $\mathbf{s}_j$ to $\mathbf{s}_{j+1}$, as they are in (42)), as well as the Poisson noise process. These issues are dealt with
separately. First, we form an intermediate variable $w_j$, which counts the number of arrivals in the absence
of the Poisson noise. For this special case, we can simplify the notation: let $\eta_{k,j} = \eta_{(j-k)T}(\mathbf{s}_j)$, and let
$p_{a,k} = \tilde{p}_{a,j}((j-k)T)$ (the conditional probability from (41)), i.e., the number of molecules
still in transit, and the probability of arrival in the current interval, respectively, for molecules that were
transmitted k intervals ago. Then $w_j$ has PDF

$$f_{W_j | S_j, S_{j+1}, X_j}(w_j \mid \mathbf{s}_j, \mathbf{s}_{j+1}, \mathbf{x}_j) = \begin{cases} \dbinom{\eta_{i-1,j}}{r} p_{a,i-1}^{\eta_{i-1,j} - r} (1 - p_{a,i-1})^{r}, & w_j = \eta(\mathbf{s}_j) + \eta(\mathbf{x}_j) - \eta(\mathbf{s}_{j+1}) - r \ \text{ for } 0 \leq r \leq \eta_{i-1,j}, \\ 0, & \text{otherwise.} \end{cases} \tag{46}$$

Finally, given $w_j$, $c_j - w_j$ is distributed Poisson with intensity $\lambda$, with probability given in (38), and the
probability of $c_j$ is given by marginalizing with respect to $w_j$, as follows:

$$f_{C_j | S_j, S_{j+1}, X_j}(c_j \mid \mathbf{s}_j, \mathbf{s}_{j+1}, \mathbf{x}_j) = \sum_{w_j} f_{W_j | S_j, S_{j+1}, X_j}(w_j \mid \mathbf{s}_j, \mathbf{s}_{j+1}, \mathbf{x}_j) \, \phi(c_j - w_j; \lambda). \tag{47}$$

It can be shown that this is not a trivial approximation, using a similar argument as for the simplified 
method in the previous section. 

This method gives a sequence of lower bounds on mutual information, where the bounds are enumerated
with respect to i; furthermore, i = 1 corresponds to the simple "memoryless" approximation from Section
IV-C. With appropriate selection of $\lambda$ (which we discuss in Section VI-A), we conjecture that the value of
the lower bound is increasing in i, since the fidelity of the approximate model increases as i increases.
Furthermore, so long as $\lambda \to 0$ as $i \to \infty$, the approximate output distribution (46) approaches the
true distribution (42), while no molecules are "lost" from the channel state, so this approximation is
asymptotically correct for the counting detector. Information rates found using this approximate model
are given in Section VI.

V. Upper Bounds on Information Rate 

As with the lower bounds, a sequence of straightforward upper bounds is also available, which is
generated by providing side information to the decoder that makes the problem tractable.

Consider a channel model that is similar to the WI model, but which operates in the following way: 
the channel first partitions the sequence of transmissions x into subsequences of length i, so that 

$$\mathbf{x} = [\mathbf{x}^{(1)}, \mathbf{x}^{(2)}, \ldots], \tag{48}$$

where

$$\mathbf{x}^{(j)} = [x_{i(j-1)+1}, x_{i(j-1)+2}, \ldots, x_{ij}]. \tag{49}$$

(For convenience, we will assume that the length of x is a multiple of i.) Note that this partition is
not the same as the one in (35). These sequences are then assigned to independent channels, and the
receiver is made aware of which molecule passed through which channel. More formally, recall from
the notation in Section III-C that u is the vector formed by arranging the receptions in order of departure
time, i.e., $u_j$ is the reception corresponding to $x_j$. Let $\mathbf{u}^{(j)} = [u_{i(j-1)+1}, u_{i(j-1)+2}, \ldots, u_{ij}]$, corresponding
to the transmissions in $\mathbf{x}^{(j)}$. Then in this channel, the vector of receptions $\mathbf{y}^*[i]$ is formed by

$$\mathbf{y}^*[i] = [\mathbf{y}^{(1)}, \mathbf{y}^{(2)}, \ldots], \tag{50}$$

where

$$\mathbf{y}^{(j)} = \mathrm{sort}_{\tau}(\mathbf{u}^{(j)}). \tag{51}$$

Note that we can recover the vector of receptions y from the original (non-partitioned) channel by sorting
$\mathbf{y}^*[i]$, since

$$\mathbf{y} = \mathrm{sort}_{\tau}(\mathbf{u}) = \mathrm{sort}_{\tau}(\mathbf{y}^*[i]). \tag{52}$$
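The relationship between the partitioned and unpartitioned outputs in (50)-(52) can be made concrete with a few lines of code. In this sketch, one molecule is released per interval and arrival times are generated with an assumed heavy-tailed propagation time (the first-passage law of a zero-drift Wiener process); both choices, and all function names, are illustrative assumptions.

```python
import random


def first_passage_time(rng, d=1.0, sigma2=1.0):
    """ASSUMED propagation time: d^2 / (sigma2 * Z^2) with Z ~ N(0, 1)."""
    z = rng.gauss(0.0, 1.0)
    while z == 0.0:
        z = rng.gauss(0.0, 1.0)
    return d * d / (sigma2 * z * z)


def partitioned_receptions(u, i):
    """Eqs. (50)-(51): sort receptions within each block of i consecutive transmissions."""
    return [sorted(u[k:k + i]) for k in range(0, len(u), i)]


def flatten(blocks):
    """Concatenate a list of blocks into a single list."""
    return [a for block in blocks for a in block]


if __name__ == "__main__":
    rng = random.Random(0)
    T = 2.198
    # u[j] is the arrival time of the molecule released at time j * T.
    u = [j * T + first_passage_time(rng) for j in range(12)]
    y_star = partitioned_receptions(u, 3)
    y = sorted(flatten(y_star))            # Eq. (52): y is recovered by sorting y*[i]
    print("recovered y equals sort(u):", y == sorted(u))
```

Sorting discards the block labels, so y is a deterministic function of $\mathbf{y}^*[i]$ but not the other way around.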

Given this partitioned channel, we have the following result: 
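Theorem 5: 1) If x contains n partitions of length i, then

$$f_{Y^*[i] | X}(\mathbf{y}^*[i] \mid \mathbf{x}) = \prod_{j=1}^{n} f_{Y^{(j)} | X^{(j)}}(\mathbf{y}^{(j)} \mid \mathbf{x}^{(j)}); \tag{53}$$

and

2) $I(X; Y^*[i]) \geq I(X; Y)$ for all integers i.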

Proof: 

1) Under the WI assumptions, molecules propagating through independent channels are equivalent to 
distinct molecules propagating through the same channel, so this statement follows from Theorem 

El 

2) This follows from (52) and the data processing inequality.

■ 

From part 1 of Theorem 5, so long as i is small, it is possible to tractably calculate $f_{Y^*[i] | X}(\mathbf{y}^*[i] \mid \mathbf{x})$,
and thus it is possible to tractably calculate $I(X; Y^*[i])$ for small values of i (e.g., using Monte Carlo
methods). From part 2 of the theorem, we have a sequence of upper bounds on I(X; Y), increasing in
complexity as i increases, and approaching the true value of I(X; Y) as $i \to \infty$. Furthermore, we can
show the following as a corollary to the theorem:

Corollary (to Theorem 5): If i and j are integers, then $I(X; Y^*[ij]) \leq I(X; Y^*[i])$ and $I(X; Y^*[ij]) \leq I(X; Y^*[j])$.

This also follows from the data processing inequality, as groups of i (or j) independent channels can be
grouped into any integer multiple of i (or, respectively, j), and sorted into a block of ij arrivals. Thus,
the sequence of upper bounds forms a partial order for general input distributions. We conjecture that this
sequence of bounds is monotonic (i.e., that $I(X; Y^*[i]) \leq I(X; Y^*[j])$ for all i > j), but we leave the
proof of this statement to future work.

VI. Results and discussion

A. Preliminaries 

In Theorem 3, we saw that unconstrained input distributions lead to infinite capacity. To generate more
meaningful results, we constrain the input distribution to be discrete-time and binary. In particular, we
quantize time to intervals of length T: at the beginning of each interval, we release a single molecule
with probability $p_x$, or no molecule with probability $(1 - p_x)$, where transmissions in different
intervals are statistically independent. (Using our notation from Section IV, one might also say that
$\Pr(\eta(\mathbf{x}_j) = 1) = p_x$, and $\Pr(\eta(\mathbf{x}_j) = 0) = 1 - p_x$, for all j.) This is a practical constraint, and is
analogous to a peak power constraint in a conventional communication system, as the transmitter cannot
release more than one molecule every T seconds. We make no claim that this form of input distribution
is optimal, and we leave the difficult problem of optimizing the input distribution to future work.

We use the WI model to obtain all our experimental results. From (25), this model has two parameters:
d and $\sigma^2$, and we assign $d = \sigma^2 = 1$ (however, since these two parameters always appear as $d^2/\sigma^2$, any
system with $d^2/\sigma^2 = 1$ would have the same performance). We pick these numbers only to illustrate the
relative performances of the bounds.

Using our approximate models from Section IV, the counting intervals are the same as the intervals
between transmissions. Meanwhile, we use the following method to select the Poisson "noise" parameter
$\lambda$. The probability that a molecule arrives in the i observation intervals of length T is given by

$$p_a = \sum_{j=0}^{i-1} p_{a,j}(0), \tag{54}$$

where $p_{a,j}(0)$ is given by (40). Thus, the probability that the molecule is "lost" is given by $(1 - p_a)$. The
expected number of "lost" molecules generated per interval is then $p_x (1 - p_a)$. In the steady state, this
must be the same as the expected number of "lost" molecules arriving per interval. Thus, we set

$$\lambda = p_x (1 - p_a) = p_x \left( 1 - \sum_{j=0}^{i-1} p_{a,j}(0) \right). \tag{55}$$

Since $p_a \to 1$ (and thus $\lambda \to 0$) as $i \to \infty$, this setting of $\lambda$ is consistent with the convergence of the
sequence of lower bounds to the true mutual information of the counting detector.
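The selection rule in (54)-(55) can be sketched as follows. The arrival-time CDF used here, $\mathrm{erfc}(d / \sqrt{2 \sigma^2 t})$, is an assumption (the first-passage law of a zero-drift Wiener process) rather than a formula taken from this section; it is adopted because, with $d = \sigma^2 = 1$, it gives a first-interval arrival probability of 0.5 at T = 2.198, matching the value stated in the results. Function names are hypothetical.

```python
import math


def arrival_cdf(t, d=1.0, sigma2=1.0):
    """ASSUMED arrival-time CDF: first passage of a zero-drift Wiener process."""
    if t <= 0.0:
        return 0.0
    return math.erfc(d / math.sqrt(2.0 * sigma2 * t))


def interval_arrival_probability(j, T):
    """p_{a,j}(0): arrival in [jT, (j+1)T) for a molecule released at time 0."""
    return arrival_cdf((j + 1) * T) - arrival_cdf(j * T)


def poisson_noise_rate(i, T, p_x):
    """Eqs. (54)-(55): Poisson parameter for a model observing i intervals."""
    p_a = sum(interval_arrival_probability(j, T) for j in range(i))   # Eq. (54)
    return p_x * (1.0 - p_a)                                          # Eq. (55)


if __name__ == "__main__":
    T, p_x = 2.198, 0.5
    for i in range(1, 5):
        print("order", i, "lambda =", poisson_noise_rate(i, T, p_x))
```

Under this assumed law the noise parameter falls slowly with the order i, reflecting the large fraction of molecules that arrive late.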

B. Results 

We begin with results illustrating the upper and lower bounds together, which are given in Figures 3
and 4, showing mutual information in bits per unit time, and bits per molecule, respectively, with respect
to $p_x$. The "unit time" is the interval length T, which is set to T = 2.198, the amount of time (given
$d^2/\sigma^2 = 1$) such that the probability of arrival in the first interval is $p_{a,0}(0) = 0.5$. We depict the first four
lower bounds and the first two upper bounds. We see a significant improvement of performance in the
bounds moving from the first to the second bound, and all the bounds monotonically improve with order.
Furthermore, practical information rates (especially in terms of bits per molecule) are clearly possible. As
expected from Theorem 3, low values of $p_x$ lead to high information rates per molecule.

Fig. 3. Upper bounds (UB) and lower bounds (LB) on mutual information per unit time with respect to $p_x$.

Since the lower bounds are known to be achievable, whereas the upper bounds are not, we focus
additional attention on the performance of the lower bounds. In Figures 5 and 6, we plot the information
rate of the lower bound per unit time where the interval lengths are T = 1.068 and T = 5.390, respectively,
for which the probabilities of arrival in the first interval are $p_{a,0}(0) = 0.333$ and $p_{a,0}(0) = 0.667$,
respectively. (However, for fair comparison, the curves are still plotted as bits per unit of time T = 2.198.)
In Figures 7 and 8, we also give results for the same two systems in terms of bits per molecule. From
Figures 5 and 7, we see a significant improvement in performance from each additional order of the bound,
which is intuitive since the arrival probability in the first interval is small; conversely, little performance
improvement is observed beyond the second order in Figures 6 and 8.

C. Discussion 

Our results are primarily intended to illustrate the feasibility of our upper and lower bounds in estimating
the true mutual information in the channel. To that end, from Figures 3 and 4, we see that our upper and
lower bounds are reasonably successful in giving an idea of the true mutual information in molecular
communication channels; in the worst case, the upper and lower bounds agree at least in order of
magnitude. Nonetheless, more work needs to be done to narrow the gap between the upper and lower
bounds. Furthermore, it is interesting that the maximizing value of $p_x$ is considerably different in the
case of the upper and lower bounds. Meanwhile, our results from Figures 5-8 indicate that the achievable
lower bound produces useful results which obey our intuition.

Fig. 4. Upper bounds (UB) and lower bounds (LB) on mutual information per molecule with respect to $p_x$.

It is natural to ask whether we expect the true mutual information to lie closer to the upper or lower
bound. For the lower bound, we see that the gap between successive curves is smaller as the order
increases; thus, it is reasonable to believe that the true mutual information for the counting detector is
not much greater than the highest-order bounds we have provided. However, as we showed in an earlier
example, the counting detector's mutual information is likely smaller than that of the ideal detector. For the
ideal detector, the upper bounds (which tend to require high computational complexity to calculate) show
a significant gap from first order to second order; thus, we cannot surmise whether the true mutual
information is close to the upper bound.

Fig. 5. Lower bounds (LB) on mutual information per unit time with respect to $p_x$, for T = 1.068.

Fig. 6. Lower bounds (LB) on mutual information per unit time with respect to $p_x$, for T = 5.390.

Fig. 7. Lower bounds (LB) on mutual information per molecule with respect to $p_x$, for T = 1.068.

Fig. 8. Lower bounds (LB) on mutual information per molecule with respect to $p_x$, for T = 5.390.

VII. Conclusion

In this paper, we have provided an information-theoretic and communication-theoretic basis for molecular
communication. There exists a vast body of tools and techniques for dealing with novel communication
channels, and our modeling results provide an avenue between molecular communication and traditional
communication theory. Furthermore, we have provided bounds on information rate for molecular
communication, which are feasible for estimating the mutual information of such channels. Our work opens
up a vast array of new problems in coding and information theory for molecular communication, such as
the problems of finding improved bounds, optimal input distributions, and models for related systems.

Appendix 

A. Derivation o f (17751) 

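From (14), we have the CDF form of the Bapat-Beg theorem. The PDF form is derived as follows. We
know that

$$f_{Z}(\mathbf{z}) = \frac{\partial}{\partial z_1} \frac{\partial}{\partial z_2} \cdots \frac{\partial}{\partial z_n} \Pr(Z_1 < z_1, Z_2 < z_2, \ldots, Z_n < z_n), \tag{56}$$

so from (14)-(15), and since $\mathrm{per}(M) = \mathrm{per}(M^T)$, we can write

$$\begin{aligned}
f_{Z}(\mathbf{z}) &= \frac{\partial}{\partial z_1} \frac{\partial}{\partial z_2} \cdots \frac{\partial}{\partial z_n} \sum_{\pi \in \mathcal{P}_n} F_{W_{\pi(1)}}(z_1) \prod_{i=2}^{n} \left( F_{W_{\pi(i)}}(z_i) - F_{W_{\pi(i)}}(z_{i-1}) \right) \\
&= \sum_{\pi \in \mathcal{P}_n} \frac{\partial}{\partial z_1} \frac{\partial}{\partial z_2} \cdots \frac{\partial}{\partial z_n} F_{W_{\pi(1)}}(z_1) \prod_{i=2}^{n} \left( F_{W_{\pi(i)}}(z_i) - F_{W_{\pi(i)}}(z_{i-1}) \right).
\end{aligned} \tag{57}$$

Considering each term in the permanental sum from (57), we can show by induction that

$$\frac{\partial}{\partial z_1} \frac{\partial}{\partial z_2} \cdots \frac{\partial}{\partial z_n} F_{W_{\pi(1)}}(z_1) \prod_{i=2}^{n} \left( F_{W_{\pi(i)}}(z_i) - F_{W_{\pi(i)}}(z_{i-1}) \right) = \prod_{i=1}^{n} f_{W_{\pi(i)}}(z_i). \tag{58}$$

To do so, note first that $\frac{\partial}{\partial z_1} F_{W_{\pi(1)}}(z_1) = f_{W_{\pi(1)}}(z_1)$, by definition. Now, in the inductive step, if

$$\frac{\partial}{\partial z_1} \frac{\partial}{\partial z_2} \cdots \frac{\partial}{\partial z_{n-1}} F_{W_{\pi(1)}}(z_1) \prod_{i=2}^{n-1} \left( F_{W_{\pi(i)}}(z_i) - F_{W_{\pi(i)}}(z_{i-1}) \right) = \prod_{i=1}^{n-1} f_{W_{\pi(i)}}(z_i), \tag{59}$$

then, writing $A_{n-1} = F_{W_{\pi(1)}}(z_1) \prod_{i=2}^{n-1} \left( F_{W_{\pi(i)}}(z_i) - F_{W_{\pi(i)}}(z_{i-1}) \right)$ for brevity,

$$\begin{aligned}
\frac{\partial}{\partial z_1} \cdots \frac{\partial}{\partial z_n} F_{W_{\pi(1)}}(z_1) \prod_{i=2}^{n} \left( F_{W_{\pi(i)}}(z_i) - F_{W_{\pi(i)}}(z_{i-1}) \right)
&= \frac{\partial}{\partial z_1} \cdots \frac{\partial}{\partial z_n} A_{n-1} \left( F_{W_{\pi(n)}}(z_n) - F_{W_{\pi(n)}}(z_{n-1}) \right) && (60) \\
&= \frac{\partial}{\partial z_1} \cdots \frac{\partial}{\partial z_n} A_{n-1} F_{W_{\pi(n)}}(z_n) - \frac{\partial}{\partial z_1} \cdots \frac{\partial}{\partial z_n} A_{n-1} F_{W_{\pi(n)}}(z_{n-1}) && (61) \\
&= \left( \frac{\partial}{\partial z_1} \cdots \frac{\partial}{\partial z_{n-1}} A_{n-1} \right) \frac{\partial}{\partial z_n} F_{W_{\pi(n)}}(z_n) && (62) \\
&= \prod_{i=1}^{n-1} f_{W_{\pi(i)}}(z_i) \, \frac{\partial}{\partial z_n} F_{W_{\pi(n)}}(z_n) && (63) \\
&= \prod_{i=1}^{n} f_{W_{\pi(i)}}(z_i), && (64)
\end{aligned}$$

where (62) follows from the fact that the second term in (61) is independent of $z_n$, (63) follows from the
inductive hypothesis in (59), and (64) follows from the definition of the PDF. Thus, from (64), we have
that

$$f_{Z}(\mathbf{z}) = \sum_{\pi \in \mathcal{P}_n} \prod_{i=1}^{n} f_{W_{\pi(i)}}(z_i), \tag{65}$$

which is the permanent of a matrix whose (i, j)th entry is $f_{W_i}(z_j)$.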
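For small n, the permanental sum in (65) can be evaluated directly by enumerating permutations. The sketch below is a brute-force illustration only (its cost grows as n!); the exponential densities and their rates are hypothetical stand-ins for the $f_{W_i}$, and the function names are not from the paper.

```python
import math
from itertools import permutations


def permanent(matrix):
    """Permanent of a square matrix: sum over permutations pi of prod_i M[pi(i)][i]."""
    n = len(matrix)
    return sum(
        math.prod(matrix[pi[i]][i] for i in range(n))
        for pi in permutations(range(n))
    )


def order_statistic_density(densities, z):
    """Eq. (65): joint density at ordered points z, with M[i][j] = f_{W_i}(z_j)."""
    matrix = [[f(zj) for zj in z] for f in densities]
    return permanent(matrix)


if __name__ == "__main__":
    # Hypothetical example: three exponential densities with different rates.
    rates = [0.5, 1.0, 2.0]
    densities = [lambda t, r=r: r * math.exp(-r * t) for r in rates]
    print(order_statistic_density(densities, [0.2, 0.7, 1.5]))
```

When all the densities are identical, every permutation contributes the same product, and the result reduces to the familiar $n! \prod_i f(z_i)$ for i.i.d. order statistics.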

Finally, (fT5l) follows by substituting Fwt(zj) with Fy^Xj Although ^ = (m^s^r/), and the 
order statistics y are sorted with respect to 7~j, the Bapat-Beg theorem admits order statistics where the 
ordering is with respect to a single component of a vector random variable, such as Finally, the result 
is obtained by differentiating with respect to $y_1, y_2, \ldots, y_n$ rather than $z_1, z_2, \ldots, z_n$.